Estimation of the Basic Reproduction Number and Vaccination Coverage of Influenza in the United States (2017-18)

Background: Determining the epidemic threshold parameter helps health providers calculate the required vaccination coverage and guides them in planning vaccination strategies. Since the trend and mechanism of influenza are very similar across countries, we conducted this study using US data from 2017 to 2018. Study design: A secondary study. Methods: R0 and the corresponding vaccination coverage were estimated using national and state-level US data from the 40th week of 2017 to the 5th week of 2018. Four methods, maximum likelihood (ML), exponential growth (EG), time-dependent reproduction numbers (TD), and sequential Bayesian (SB), were used to estimate R0 and, from it, the minimum vaccination coverage. A gamma distribution was assumed for the generation time. Results: The epidemic peak in most states occurred in the 15th week after the beginning of the epidemic. A gamma distribution with a mean of 3.6 days and a standard deviation of 1.6 days was used for the generation time. The R0 (vaccination coverage) equaled 1.94 (48.4%), 1.80 (44.4%), 3.06 (67.3%), and 2.11 (52.6%) for the EG, ML, SB, and TD methods at the national level, respectively. Conclusion: The R0 estimates ranged from 1.80 to 3.06, indicating that an epidemic occurred in the US (R0>1). Thus, at least 44.4% to 67.3% of the population would need to be vaccinated to prevent future influenza epidemics. The findings of this study can assist future studies in applying vaccination-based disease control strategies to prevent a national disaster.


Introduction
Clinical research and studies of communicable (infectious) diseases pursue different purposes, such as determining the trend, the epidemic threshold parameters, and the vaccination coverage. The epidemic threshold parameter, R0, plays a key role in evaluating proposed control strategies before interventions or preventive vaccination strategies are applied. In biostatistics and epidemiology, the epidemic threshold parameter (whose special form is known as the basic reproduction number or reproductive ratio) is defined as the mean number of secondary cases infected by an initial case in a fully susceptible community [1][2][3]. The basic reproduction number is generally compared with unity to assess the spread of an infectious disease in the population. An R0 greater than unity (R0>1) means that an epidemic has occurred and each infected individual generates more than one new case 4. Conversely, the epidemic likely fades out when the basic reproduction number is less than unity (R0<1), and R0=1 marks the threshold between these two regimes [1][2]5.
Several approaches have been suggested to estimate R0. These include maximum likelihood (ML), exponential growth rate (EG), estimation of time-dependent reproduction numbers (TD), attack rate, gamma-distributed generation time, the final size of the epidemic, and the Richards model [6][7][8][9][10][11][12][13][14]. The choice of approach depends on the type of data being studied (household data or daily incidence data). In each method, the basic reproduction number is used to assess the intensity of the interventions and vaccination strategies required, and from it the vaccination coverage for an infectious disease can be estimated. Vaccination strategies are introduced to reduce and prevent the risk of transmission of infectious diseases in the target community. In addition, R0 and vaccination coverage are directly linked, which means that any change in R0 leads to a corresponding change in the required vaccination coverage. Therefore, a larger proportion of a susceptible population should be vaccinated where the estimated basic reproduction number is larger.
The estimation of the basic reproduction number (R0) has been addressed for various infectious diseases, including influenza, HIV, SARS, smallpox, malaria, yellow fever, measles, and Ebola [15][16][17][18][19][20][21][22]. In particular, influenza is a leading cause of mortality, with a considerable number of annual deaths worldwide 23. Several influenza epidemics occurred worldwide from 2009 to 2017, during which a substantial number of people died annually 24. For comparison, the number of deaths caused by the earlier "Asian flu" and "Hong Kong flu" pandemics is estimated at 1 to 4 million 25. On the other hand, the annual deaths attributed to influenza are estimated at nearly 250,000 to 500,000 globally 26. An epidemic has recently occurred in the US. Indeed, influenza epidemics have been recorded in the US every year over the past 20 years, showing that influenza is a yearly seasonal threat 28 (Table 1). A few of these epidemics are listed below: In Philadelphia and New York (Sep 14 to Oct 17, 1918), the estimated R0 (95% CI) and generation time were 2.14 (1.88, 2.39) and 2.5 d, respectively. The formula used for estimating R0 was $R_t = \sum_{i>0} IR_{t+i}\, w_i$ 29. In the USA
The R0s reported above were obtained using different methods and generation times and therefore cannot be compared directly. Another aim of the present study was thus to calculate the R0 of influenza for a single data set (US data) with different methods while assuming the same generation time distribution, so that the approaches can be compared. The methods require the number of cases and a generation time distribution; following a similar study, a gamma distribution with a mean of 3.6 and an SD of 1.6 was used.
In addition, determining the epidemic threshold helps health providers calculate the required coverage and guides them in planning vaccination strategies. The critical vaccination coverage is computed directly from R0 and reflects the indirect effect of vaccination, also called herd protection 33. Indirect protection through vaccination coverage is not only economical but also prevents epidemics, and its effect exceeds the direct effect 34.
Since the trend of influenza is very similar in different years, we conducted this study to determine the epidemic threshold parameters and, consequently, the vaccination coverage in the US from 2017 to 2018.

Statistical Analysis
In the first step, the classic SIR (susceptible, infected/infectious, and removed) compartmental model, which describes the course of influenza, was used to represent person-to-person transmission in different states.
Next, four methods were applied to estimate R0 from cumulative case count data in R statistical software (version 3.4.2) with the R0 package: ML, EG, the sequential Bayesian method (SB), and TD. All four methods have been used for influenza data in earlier studies, so all of them are applicable to this type of data.
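As a minimal sketch of the SIR dynamics referred to here, the following Python snippet integrates the standard SIR equations with a simple Euler step. The function name, the step size, the initial fractions, and the parameter values are illustrative assumptions and are not taken from this study; in this model R0 equals beta/gamma.

```python
def simulate_sir(beta, gamma, s0, i0, r_init, days, dt=0.1):
    """Euler integration of the classic SIR model (compartments as population fractions)."""
    s, i, r = s0, i0, r_init
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i * dt   # flow S -> I during one step
        new_removals = gamma * i * dt        # flow I -> R during one step
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        trajectory.append((k * dt, s, i, r))
    return trajectory

# Assumed values for illustration only: mean infectious period of 3.6 days, R0 = 1.94
gamma = 1 / 3.6
beta = 1.94 * gamma
traj = simulate_sir(beta, gamma, s0=0.999, i0=0.001, r_init=0.0, days=120)
peak_time, _, peak_i, _ = max(traj, key=lambda row: row[2])
print(f"peak infectious fraction {peak_i:.3f} at day {peak_time:.1f}")
```

Because R0 is above one in this example, the infectious fraction first grows and then declines once the susceptible fraction falls below 1/R0.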
It is necessary to have a distribution for the generation time in each method defined below. Moreover, a brief overview of each method is presented.

Generation time
The generation time, or serial interval, is defined as the length of time between infection of a primary case and infection of a secondary case 35.
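To illustrate how a gamma generation time with a given mean and standard deviation can be turned into daily weights, the sketch below matches the gamma shape and scale to the two moments and integrates the density over one-day intervals. The function names, the 15-day truncation, and the midpoint integration are assumptions made for this example; the discretization used by the R0 package may differ.

```python
import math

def gamma_params(mean, sd):
    """Moment matching: shape = (mean/sd)^2, scale = sd^2/mean."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale

def gamma_pdf(x, shape, scale):
    """Gamma density evaluated through its logarithm for numerical stability."""
    if x <= 0:
        return 0.0
    log_pdf = ((shape - 1) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)

def discretize(mean, sd, max_days=15, substeps=200):
    """Probability mass on each interval (day - 1, day], renormalized to sum to one."""
    shape, scale = gamma_params(mean, sd)
    h = 1.0 / substeps
    weights = []
    for day in range(1, max_days + 1):
        # midpoint rule on the interval (day - 1, day]
        mass = sum(gamma_pdf(day - 1 + (j + 0.5) * h, shape, scale)
                   for j in range(substeps)) * h
        weights.append(mass)
    total = sum(weights)
    return [w / total for w in weights]

# Mean 3.6 d and SD 1.6 d are the values used in this study
weights = discretize(3.6, 1.6)
print([round(w, 3) for w in weights[:7]])
```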

EG:
In the exponential growth rate method, the R0 formula is $R = 1/M(-r)$. Here, r, M, and Nt denote the growth rate of the infected population, the moment generating function of the generation time distribution, and the number of cases in consecutive time units, respectively, and w represents the generation time distribution. The growth rate r is estimated by Poisson regression on the case counts Nt 7.
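For a gamma generation time with shape k and scale θ, the moment generating function is M(t) = (1 − θt)^(−k), so the exponential growth relation R = 1/M(−r) reduces to R = (1 + rθ)^k. The sketch below applies this relation. The function names and the synthetic counts are assumptions for illustration, and the growth rate is fitted by ordinary least squares on log counts as a simple stand-in for the Poisson regression used in the study.

```python
import math

def growth_rate_loglinear(counts):
    """Slope of log(count) against time by least squares (stand-in for Poisson regression)."""
    n = len(counts)
    times = list(range(n))
    logs = [math.log(c) for c in counts]
    t_bar = sum(times) / n
    y_bar = sum(logs) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, logs))
    sxx = sum((t - t_bar) ** 2 for t in times)
    return sxy / sxx

def r0_exponential_growth(r, gt_mean, gt_sd):
    """R = 1 / M(-r) for a gamma generation time matched to gt_mean and gt_sd."""
    shape = (gt_mean / gt_sd) ** 2
    scale = gt_sd ** 2 / gt_mean
    return (1 + r * scale) ** shape

# Synthetic daily counts growing at 0.2 per day (made-up data, not from the study)
synthetic = [10 * math.exp(0.2 * t) for t in range(10)]
r_hat = growth_rate_loglinear(synthetic)
print(f"r = {r_hat:.3f} per day, R = {r0_exponential_growth(r_hat, 3.6, 1.6):.2f}")
```

The growth rate and the generation time must be expressed in the same time unit for the relation to hold.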

ML:
In this method, the number of secondary cases infected by a primary case is assumed to follow a Poisson distribution with mean R. Suppose $N_0, N_1, \ldots, N_T$ represent the cases over time and w denotes the generation time distribution. The log-likelihood function is built from the Poisson distribution of the case counts, and the value of R that maximizes it gives the reproduction number 36.
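One common way to write this likelihood, assumed here rather than quoted from reference 36, takes each count $N_t$ as Poisson with mean $\mu_t = R \sum_{i=1}^{t} N_{t-i} w_i$. Under that assumption the maximizer has the closed form $\hat{R} = \sum_t N_t / \sum_t \lambda_t$ with $\lambda_t = \sum_{i=1}^{t} N_{t-i} w_i$. The Python sketch below implements this form; the function names, the case counts, and the three-point generation time weights are hypothetical.

```python
import math

def expected_pressure(counts, w):
    """lambda_t = sum_{i=1..t} N_{t-i} * w_i for t = 1..T, where w[0] is the weight at lag 1."""
    lam = []
    for t in range(1, len(counts)):
        max_lag = min(t, len(w))
        lam.append(sum(counts[t - i] * w[i - 1] for i in range(1, max_lag + 1)))
    return lam

def log_likelihood(R, counts, w):
    """Poisson log-likelihood of N_1..N_T given R (requires every lambda_t > 0)."""
    total = 0.0
    for n_t, lam_t in zip(counts[1:], expected_pressure(counts, w)):
        mu = R * lam_t
        total += n_t * math.log(mu) - mu - math.lgamma(n_t + 1)
    return total

def ml_reproduction_number(counts, w):
    """Closed-form maximizer of the log-likelihood above."""
    return sum(counts[1:]) / sum(expected_pressure(counts, w))

# Hypothetical counts and generation time weights, for illustration only
example_counts = [5, 8, 14, 22, 37, 60]
example_w = [0.2, 0.5, 0.3]
r_ml = ml_reproduction_number(example_counts, example_w)
print(f"ML estimate of R: {r_ml:.2f}")
```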

SB:
Suppose N(t+1) denotes the incidence at time t+1 for the SIR model, which approximately follows a Poisson distribution with mean $N(t)e^{\gamma(R-1)}$, where $1/\gamma$ is the average generation time. A non-informative prior for R is used in the Bayesian framework. The posterior distribution of R on the previous day is used as the prior distribution for R on the new day, and the posterior distribution of R is updated by Bayes' rule at each step. An exponential distribution is assumed for the generation time in this method 11.
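The sequential updating can be sketched on a grid of candidate R values: each new count is scored with the Poisson likelihood with mean N(t)e^{γ(R−1)}, and the resulting posterior becomes the prior for the next step. In the Python sketch below, the function names, the grid, the flat starting prior, the value of γ, and the counts are all assumptions chosen for illustration.

```python
import math

def poisson_log_pmf(n, mu):
    return n * math.log(mu) - mu - math.lgamma(n + 1)

def sequential_bayes(counts, gamma, r_grid):
    """Return the posterior over r_grid after each new observation."""
    prior = [1.0 / len(r_grid)] * len(r_grid)   # flat (non-informative) starting prior
    history = []
    for n_prev, n_next in zip(counts[:-1], counts[1:]):
        if n_prev == 0:
            history.append(prior)               # no information when the previous count is zero
            continue
        log_post = [
            math.log(p) + poisson_log_pmf(n_next, n_prev * math.exp(gamma * (R - 1)))
            if p > 0 else float("-inf")
            for p, R in zip(prior, r_grid)
        ]
        top = max(log_post)
        unnorm = [math.exp(lp - top) for lp in log_post]
        total = sum(unnorm)
        prior = [u / total for u in unnorm]     # posterior becomes the next prior
        history.append(prior)
    return history

# Made-up counts that grow by roughly exp(gamma * (2 - 1)) per step
sb_counts = [100, 132, 174, 230, 304]
sb_gamma = 1 / 3.6
sb_grid = [0.01 * k for k in range(1, 501)]
sb_history = sequential_bayes(sb_counts, sb_gamma, sb_grid)
sb_mean = sum(R * p for R, p in zip(sb_grid, sb_history[-1]))
print(f"posterior mean of R after the last update: {sb_mean:.2f}")
```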
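
TD:
In this method, Pij denotes the probability of infection transmission between case i (at time ti) and case j (at time tj). Rj, the reproduction number of case j, is obtained from these probabilities, and Rt is the mean of all Rj computed over all transmission networks compatible with the observed cases 6.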
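A sketch of a time-dependent estimate on aggregated counts is given below, following the general idea of weighting each possible infector by the generation time distribution. It assumes that a case at time t was infected by a case at an earlier time s with probability proportional to N_s w(t − s); the function name, the counts, and the weights are hypothetical, and the exact formula of reference 6 may differ in detail.

```python
def time_dependent_r(counts, w):
    """Average number of secondary cases per case for each time index.

    w[0] is the generation time weight at lag 1, w[1] at lag 2, and so on.
    """
    T = len(counts)
    # Total infection pressure on cases occurring at time t
    denom = [
        sum(counts[u] * w[t - u - 1] for u in range(max(0, t - len(w)), t))
        for t in range(T)
    ]
    r_t = []
    for s in range(T):
        total = 0.0
        for t in range(s + 1, min(T, s + len(w) + 1)):
            if denom[t] > 0:
                # expected cases at time t attributed to one case at time s
                total += counts[t] * w[t - s - 1] / denom[t]
        r_t.append(total)
    return r_t

# Hypothetical counts and generation time weights, for illustration only
td_counts = [5, 8, 14, 22, 37, 60]
td_w = [0.2, 0.5, 0.3]
td_r = time_dependent_r(td_counts, td_w)
print([round(x, 2) for x in td_r])
```

The values for the last time points are biased downward because the secondary cases of recent cases have not yet been observed.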

Vaccination coverage
To compute the vaccination coverage, R0 must first be estimated. The critical vaccination coverage, i.e. the proportion of people who must receive vaccines, is then obtained from the reproduction number as $v = 1 - 1/R_0$. Vaccination coverage is also defined as the reduction in the probability of infection risk, which is a value between 0 and 1 33.
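As a quick check of the relation v = 1 − 1/R0, the short Python sketch below applies it to the four national R0 estimates reported in this paper. The function name is an assumption, and returning zero coverage for R0 ≤ 1 is a convention adopted here for illustration.

```python
def critical_coverage(r0):
    """Critical vaccination coverage v = 1 - 1/R0 (zero when R0 <= 1)."""
    if r0 <= 1:
        return 0.0
    return 1 - 1 / r0

# National R0 estimates reported in this study
national = {"EG": 1.939, "ML": 1.80, "SB": 3.057, "TD": 2.111}
for method, r0 in national.items():
    print(f"{method}: R0 = {r0:.3f}, coverage = {100 * critical_coverage(r0):.1f}%")
```

The computed percentages agree with the reported coverage values of 48.4%, 44.4%, 67.3%, and 52.6%.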
Vaccine efficacy for reducing transmission can be achieved by

Data
The four above-mentioned models were fitted to the US 2017-18 pH1N1 data from the FluView weekly reports obtained from the Centers for Disease Control and Prevention (CDC) website 38. Data from the U.S. Outpatient Influenza-like Illness Surveillance Network (ILINet) were used, which report influenza cases in all 47 states, the District of Columbia, New Dakota, New York City, Puerto Rico, and the U.S. Virgin Islands for each age group from the 40th week of 2017 to the 5th week of 2018.
For all states, we estimated R0 with the four methods. Afterward, we calculated the variance of R0 across states for each method to check for variability among states, and then hierarchical cluster analysis was applied as an exploratory technique to determine the number of clusters for the K-means method used to cluster the states. Cluster analysis was performed in Minitab 17 statistical software (Minitab Inc., State College, PA) and IBM SPSS Statistics 22 (Chicago, IL, USA).
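The K-means step can be sketched in a few lines of Python. The example below runs a basic Lloyd iteration with two clusters on made-up four-dimensional vectors (EG, ML, SB, TD); the function name, the starting centers, and the values are assumptions for illustration and are not the state-level estimates of this study, which were clustered in Minitab and SPSS.

```python
def kmeans(points, centers, iterations=50):
    """Basic Lloyd iteration; points and centers are tuples of equal length."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
        if new_centers == centers:   # assignments no longer change the centers
            break
        centers = new_centers
    return centers, clusters

# Made-up (EG, ML, SB, TD) vectors for six hypothetical states
km_points = [
    (1.60, 1.50, 1.70, 1.70), (1.70, 1.60, 1.80, 1.80), (1.65, 1.55, 1.75, 1.75),
    (2.50, 2.40, 2.30, 2.50), (2.60, 2.50, 2.40, 2.60), (2.70, 2.60, 2.40, 2.70),
]
km_centers, km_clusters = kmeans(km_points, [km_points[0], km_points[3]])
print([len(c) for c in km_clusters])
```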

Results
The incidence data are presented on a weekly basis, and all dates of the US data are given as week/year from the 40th week of 2017 to the 5th week of 2018. The epidemic peak in most states occurred in the 15th week after the start of the epidemic. The number of infected cases at the national level is provided in Figure 1. The number of infected cases was also plotted at the national level in five age groups. The maximum number of cases was in the age group of 5 to 24 yr, and the peak incidence of influenza in this group occurred in the 19th week (Figure 2). The gamma distribution with a mean of 3.6 d and a standard deviation of 1.6 d was used as the distribution of the generation time 1. The results are presented in three parts: national level, state level, and comparison of methods.

National level
The national R0 and vaccination coverage are summarized in Table 1 for the four methods (ML, EG, SB, and TD), assuming a wholly susceptible population before the start of the epidemic.
At the national level, the highest value of R0 was obtained with SB. Indeed, the estimated R0 at the national level by SB was quite different from that of the other methods (3.057; 95% CI: 3.037, 3.08). The estimated R0s for EG, ML, and TD were 1.939 (1.937, 1.940), 1.80 (1.789, 1.802), and 2.111 (2.102, 2.120), respectively. In addition, the estimates of vaccination coverage varied across the four methods, from 44.4% to 67.3%. The lowest and highest vaccination coverage values in this setting were associated with the ML and SB methods, respectively (Table 1).

States Level
The computed R0s (95% CI) and vaccination coverage for all states by EG, ML, SB, and TD are summarized in Table 1 in the appendix. In general, the estimates of R0 at state level ranged from 1.55 to 2.79 for EG, 1.48 to 2.65 for ML, 1.62 to 2.46 for SB, and 1.67 to 2.73 for TD.
For each method, the variance of R0 among states was calculated, and the hypothesis that this variance equals zero was tested by the chi-squared test (Table 1). The results for all methods indicated variation in R0 among states because the reported confidence intervals did not include zero. Therefore, the states can be clustered using cluster analysis.
In the second cluster, the R0s calculated by the four methods had their minimum value for Arkansas. New Hampshire had the largest R0 for EG and ML. Moreover, the R0 was maximum in Oklahoma for SB and TD. In total, the estimated R0 (95% CI) in the second cluster equaled 1.818 (1.817, 1.819). Despite the low vaccination coverage determined for Arkansas, there were states where the calculated vaccination coverage was high. At the state level, the vaccination coverage percentages calculated using EG, ML, SB, and TD were the lowest for Arkansas (EG: 35.5%, ML: 32.4%, SB: 38.3%, and TD: 40.1%). Furthermore, the highest vaccination coverages (EG: 64.1%, ML: 62.3%, and TD: 63.4%) were associated with Delaware.

Comparison of methods
Except at the national level, the R0 estimates across all states (in both clusters) were approximately identical for SB and TD. In other words, the R0 estimated using SB was consistent with that calculated using TD. The R0s estimated by ML were slightly lower than those estimated using EG. The variance (95% CI) of R0 among the four approaches equaled 0.32 (0.10, 4.51), indicating variability. Cluster analysis of the four methods resulted in two clusters. The first cluster included EG and ML, while the second cluster comprised SB and TD. This analysis confirmed the above findings.

Discussion
The simple SIR compartmental model was used as the transition model, and the estimates of R0 obtained with ML, EG, SB, and TD varied among states because of differences in the number of infectious cases during the outbreak. The variability of R0 depends on many factors, including location, estimation method, generation time, and pandemic wave 10,39. The virus and the network size are also influential factors in influenza transmission 10. The timing of the outbreak peak was the same in most states. Moreover, a sharp peak was observed in the incidence of H1N1 at the national level (Figure 1).
We found variation in the estimates of R0 obtained with ML, EG, SB, and TD as implemented in the R0 package. A quantitative comparison of the findings revealed that the R0 estimates from SB were close to those from TD. In addition, EG and ML yielded almost identical results, and cluster analysis of the four methods confirmed this pattern.
The estimated epidemic threshold values based on three methods (EG, ML, and TD) in the first cluster were higher than those of the second cluster. Consequently, states in the first cluster have a higher risk of epidemic and require more vaccination coverage.
Generally, the R0 estimated from the ILINet data with the four methods was greater than one at both the national level (winter of 2017) and the state level (in both clusters), indicating an influenza epidemic. Therefore, it seems necessary to consider appropriate solutions to control, reduce, and prevent influenza epidemics or pandemics. An effective way to protect people from influenza is vaccination. Annual vaccination against seasonal influenza provides protection in high-risk groups (elderly people, ill persons, pregnant women, and children) and can also reduce the mortality rate, the incidence of disease, exacerbations, hospitalizations, and costs.
In determining vaccination coverage, R0 plays a key role because the estimated vaccination coverage is a function of R0 (v=1-1/R0). In other words, the percentage of a community that needs to be vaccinated against influenza can be expressed in terms of R0. Vaccination coverage and R0 are directly linked, which means that the required vaccination coverage increases as R0 increases. The present study also provides an estimate of vaccination coverage at both the national and state levels, which is one of its strengths.
The R0 of influenza for the USA ranged between 1.3 and 3.1 from 1918 to 2013 using various methods [35][36][38][39]. In our study, similar values were obtained for R0. For example, at the national level, R0 was estimated using four methods (ML, EG, SB, and TD), and the values were in the range of 1.8 to 3.06, indicating that an epidemic occurred in the USA (R0>1).
Various studies have employed different methods and generation times to estimate the epidemic threshold, so their results cannot be meaningfully compared with one another. In our study, the R0 of influenza was estimated for the US data with the same generation time distribution in all methods so that the approaches can be compared, which is another strength of this study.
A weakness of this study was that, although cluster analysis determined similar methods, there was no exact criterion for determining the best method.

Conclusion
The findings of our study can be used to improve policymaking, health care, and public health not only in the USA but also in other parts of the world. These results can be extended to other countries with similar epidemics. As the transmission mechanism is the same, the parameters that influence the disease should be the same. In other words, because influenza epidemics are similar across countries, vaccinating at least 44.4% of the population in our country could prevent a flu outbreak. Hence, awareness of the R0 of influenza as a highly infectious disease is helpful for future studies in applying disease control through vaccination strategies to prevent a national disaster. Indirect protection through vaccination coverage is not only economical but also prevents epidemics, and its effect exceeds the direct effect 34.
Influenza epidemics are likely to recur. Therefore, a more comprehensive study is needed to deal with this dangerous virus.